182 research outputs found

A walk on Python-igraph

Author: Figuerola Carlos G.
Publication date: 17/04/2015

Brief tutorial on i-graph, a library and application for programmers

Gestion del Repositorio Documental de la Universidad de Salamanca

A walk on Python-igraph

Author: G. Figuerola Carlos
Publication date: 17/04/2015

Brief tutorial of the i-graph library, a tool for programmers

Retrieval of bilingual Spanish-English information by means of a standard automatic translation system

Author: Alonso-Berrocal José-Luis
G.-Figuerola Carlos
Gómez-Díaz Raquel
Zazo Ángel F.
Publication date: 01/01/2000

This paper describes our participation in bilingual retrieval (queries in Spanish on documents in English) using an information retrieval system based on the vector model. The queries, formulated in Spanish, were translated into English with a commercial automatic translation system; the terms extracted from the resulting translations were filtered to remove stop words and then normalised by stemming. Results are slightly more than 15% poorer than those obtained through monolingual retrieval with the original queries in English.

E-LIS

Automatic Classification of Documents. A Case Study

Author: Figuerola Carlos G.
Publication date: 01/01/2013

Classifying documents takes a great deal of work and may become impractical when the number of documents is high. When documents are in digital format, automatic classification techniques can be applied. Supervised automatic classification systems are able to identify the category or class to which a document should be assigned. This is achieved by means of a training process in which the system learns the key features of each class. We describe some of the most widely used techniques, such as Bayesian classifiers, as well as the adjustments that can be made to improve their effectiveness. We also describe their practical use in a real case, analyze the implementation details, and discuss the results.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Gestion del Repositorio Documental de la Universidad de Salamanca

Web Page Retrieval by Combining Evidence

Author: Alonso-Berrocal José-Luis
G.-Figuerola Carlos
Rodríguez-Vázquez-de-Aldana Emilio
Zazo Ángel F.
Publication date: 01/01/2006

The participation of the REINA Research Group in WebCLEF 2005 focused on the monolingual mixed task. Queries, or topics, are of two types: named pages and home pages. For both, we first perform a search by thematic content; for the same query, we also search several information elements of every page (title, some meta tags, anchor text) and then combine the results. For home page queries, we try to detect them using a method based on certain keywords and their patterns of use. Afterwards, the results of the thematic content retrieval are re-ranked based on PageRank and centrality coefficients.

E-LIS

The implications of Wikipedia for contemporary science education: Using Social Network Analysis Techniques for Automatic Organisation of Knowledge

Author: G. Figuerola Carlos
Groves Tamar
Quintanilla Miguel-A.
Publication date: 07/10/2015

Wikipedia is an open content resource built by a community of users and widely used in educational contexts by both students and teachers. Wikipedia articles are connected by hyperlinks, so Wikipedia can be represented as a network in which the nodes are the articles and the edges are the hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques, and more precisely community detection techniques, to identify clusters of articles with similar content. Because the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles make up about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical scientific disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.

Análisis cibermétrico y visual de Twitter

Author: Alonso-Berrocal José-Luis
G. Figuerola Carlos
Zheng Zhangxian
Publication venue: Departamento de Informática y Automática. Facultad de Ciencias. Universidad de Salamanca
Publication date: 01/01/2013

This paper addresses the need to collect the profile, followers, and followed accounts of a Twitter user via the API. It develops a crawler application using the Python-Twitter library, with the aim of analyzing and visualizing the network of Twitter users.

E-LIS

The implications of Wikipedia for contemporary science education: Using Social Network Analysis Techniques for Automatic Organisation of Knowledge

Author: Figuerola Carlos G.
Groves Tamar
Quintanilla Fisac Miguel Ángel
Publication date: 07/10/2015

Wikipedia is an open content resource built by a community of users and widely used in educational contexts by both students and teachers. Wikipedia articles are connected by hyperlinks, so Wikipedia can be represented as a network in which the nodes are the articles and the edges are the hyperlinks. In this paper we analyze a complete copy of the Spanish Wikipedia. We apply social network analysis techniques, and more precisely community detection techniques, to identify clusters of articles with similar content. Because the number of clusters is relatively small, we use manual analysis to detect science articles. In addition, we identify the most representative scientific fields and their main features. We conclude that science articles make up about 11.66% of Spanish Wikipedia articles and that the most important clusters of scientific articles do not always coincide with the classical scientific disciplines. This kind of analysis contributes to a better understanding of Wikipedia as an educational tool.

Gestion del Repositorio Documental de la Universidad de Salamanca

La cibermetría en la recuperación de información en el Web

Author: Alonso-Berrocal José-Luis
G.-Figuerola Carlos
Zazo Ángel F.
Publication venue: Editorial de la UPV
Publication date: 01/01/2002

The exponential growth of the web and the characteristics of its data (distributed, highly volatile, unstructured, redundant, and highly heterogeneous) have introduced new problems in information retrieval processes. It is therefore necessary to open new avenues of research that allow us to obtain good levels of accuracy. Work based on exploiting the hypertext features of the web is gaining prominence. Cybermetrics provides many options for working with links, and many of its techniques may be useful in information retrieval on the web.

E-LIS

Science and Technology in Social Networks: Twitter

Author: Alonso Berrocal José Luis
Figuerola Carlos G.
Zazo Rodríguez Ángel Francisco
Publication date: 01/11/2014

The use of the Internet as the main source for finding scientific information is reinforced by the use of social networks. This situation calls for the study and standardization of the content obtained in this way. By studying the Twitter profiles that disseminate scientific information, one can identify the main topics and the quantity and quality of the scientific information shared.

Gestion del Repositorio Documental de la Universidad de Salamanca